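"""
Data preprocessing module. Reads URLs from a file, extracts each disease's
pinyin name from its URL, and uses that name as a folder index for locating
the corresponding disease folder on the local disk.
"""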


def split_lines(url):
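    """
    Split one line of the file and extract the disease's pinyin name.
    :param url: one line of the file; fields are whitespace-separated and the
        second field is the disease URL, which must end with a trailing slash
    :return: the disease's pinyin name (the second-to-last '/'-separated
        segment of the URL)
    """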
    return url.split()[1].split('/')[-2]
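

# A hedged sketch, not part of the original pipeline: the same extraction done
# with urllib.parse instead of raw string splitting. `slug_from_url` is a
# hypothetical helper name. Unlike split_lines, it takes only the URL field
# (not the whole line) and does not depend on a trailing slash being present.
def slug_from_url(url):
    """
    Return the last non-empty path segment of a URL.
    :param url: a single URL string (assumed to be already split off the line)
    :return: the last path segment, or None if the URL has no path segments
    """
    from urllib.parse import urlparse

    # Drop the empty strings produced by leading/trailing slashes.
    segments = [s for s in urlparse(url).path.split('/') if s]
    return segments[-1] if segments else None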


def get_disease_url(path):
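    """
    Read the disease URLs from a file.
    :param path: path to the file containing the disease URLs
    :return: a list of the file's lines, without line terminators
    """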
    with open(path, mode='r', encoding='utf-8') as f:
        return f.read().splitlines()
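

# A hedged sketch of a lazy alternative to get_disease_url, for files too large
# to read in one call. `iter_disease_lines` is a hypothetical name. Skipping
# blank lines is an assumption added here (split_lines would raise IndexError
# on them); get_disease_url itself keeps blank lines.
def iter_disease_lines(path):
    """
    Yield the non-blank lines of a file one at a time.
    :param path: path to the file containing the disease URLs
    :return: a generator of lines with surrounding whitespace stripped
    """
    with open(path, mode='r', encoding='utf-8') as f:
        for line in f:
            line = line.strip()
            if line:
                yield line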


if __name__ == '__main__':
    path = r'D:\PycharmProjects\39_health_disease\disease_fuse_no_dup.txt'
    for item in get_disease_url(path):
        disease_name = split_lines(item)
        print(disease_name)
        # break
